Subband-Based Group Delay Segmentation of Spontaneous Speech into Syllable-Like Units

Authors

  • T. Nagarajan
  • Hema A. Murthy
Abstract

In the development of a syllable-centric Automatic Speech Recognition (ASR) system, segmentation of the acoustic signal into syllabic units is an important stage. Although the short-term energy (STE) function contains useful information about syllable segment boundaries, it has to be processed before segment boundaries can be extracted. This paper presents a sub-band based group delay approach to segment spontaneous speech into syllable-like units. This technique exploits the additive property of the Fourier transform phase and the deconvolution property of the cepstrum to smooth the STE function of the speech signal and make it suitable for syllable boundary detection. By treating the STE function as a magnitude spectrum of an arbitrary signal, a minimum phase group delay function is derived. This group delay function is found to be a better representative of the STE function for syllable boundary detection. Although the group delay function derived from the STE function of the speech signal contains segment boundaries, the boundaries are difficult to determine in the context of long silences, semi-vowels, and fricatives. In this paper, these issues are specifically addressed and algorithms are developed to improve the segmentation performance. The speech signal is first passed through a bank of three filters, corresponding to three different spectral bands. The STE functions of these signals are computed. Using these three STE functions, three minimum phase group delay functions are derived. By combining the evidence derived from these group delay functions, the syllable boundaries are detected. Further, a multi-resolution based technique is presented to overcome the problem of shift in segment boundaries during smoothing. Experiments carried out on the Switchboard and OGI-MLTS corpora show that the error in segmentation is at most 25 ms for 67% and 76.6% of the syllable segments, respectively.
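The central step described in the abstract, treating the STE function as a magnitude spectrum and deriving a minimum phase group delay function from it, can be illustrated with a short sketch. This is a generic cepstrum-based construction, not necessarily the authors' exact algorithm: the function names, the even-symmetric extension, the small `eps` floor, and the `n_lifter` smoothing parameter are all assumptions made for illustration. The abstract does not specify the sub-band filters, so the sketch handles a single STE function; under the same assumptions it would be applied once per band.

```python
import numpy as np


def short_term_energy(signal, frame_len, hop):
    """Frame-wise energy of a 1-D signal (hypothetical helper)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.array([
        np.sum(signal[i * hop:i * hop + frame_len] ** 2)
        for i in range(n_frames)
    ])


def min_phase_group_delay(ste, n_lifter, eps=1e-10):
    """Group delay of the minimum-phase signal whose magnitude
    spectrum is taken to be `ste` (assumed construction).

    ste      : non-negative sequence, read as |X(w)| on [0, pi]
    n_lifter : number of low-quefrency cepstral coefficients kept;
               smaller values give a smoother group delay
    """
    ste = np.asarray(ste, dtype=float)
    # Even-symmetric extension so the sequence looks like the
    # magnitude spectrum of a real signal over [0, 2*pi).
    mag = np.concatenate([ste, ste[-2:0:-1]]) + eps
    n = len(mag)
    # Real cepstrum of the log magnitude.
    cep = np.fft.ifft(np.log(mag)).real
    # Fold into a causal (minimum-phase) cepstrum and lifter it.
    n_keep = min(n_lifter, n // 2)
    c_hat = np.zeros(n)
    c_hat[1:n_keep] = 2.0 * cep[1:n_keep]
    # For a minimum-phase signal: tau(w) = sum_n n * c_hat[n] * cos(n w).
    gd = np.fft.fft(np.arange(n) * c_hat).real
    return gd[:len(ste)]
```

In this sketch, lowering `n_lifter` discards high-quefrency detail, which is one way the cepstral domain can be used to smooth the function before candidate boundaries are picked from its peaks or valleys. How the evidence from the three bands is combined is not given in the abstract and is left out here.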


Similar resources

GROUP DELAY BASED SEGMENTATION OF SPONTANEOUS SPEECH INTO SYLLABLE-LIKE UNITS T.Nagarajan,

In the development of a syllable-centric ASR system, segmentation of the acoustic signal into syllabic units is an important stage. This paper presents a minimum phase group delay based approach to segment spontaneous speech into syllable-like units. Here, three different minimum phase signals are derived from the short term energy functions of three sub-bands of speech signals, as if it were a...


Automatic Segmentation of Punjabi Speech into Syllable-Like Units using Group Delay A Review

The basic building blocks of a speech segmentation system are its units. Thus it’s an important stage to select appropriate units into which the continuous speech needs to be segmented. The syllable like units is found to be the better representative for Indian languages. Punjabi is the most widely used language, thus this paper describes the automatic segmentation of Punjabi speech into syllab...


Segmentation of speech into

In the development of a syllable-centric ASR system, segmentation of the acoustic signal into syllabic units is an important stage. This paper presents a minimum phase group delay based approach to segment spontaneous speech into syllable-like units. Here, three different minimum phase signals are derived from the short term energy functions of three sub-bands of speech signals, as if it were a ...




A syllable based continuous speech recognizer for Tamil

This paper presents a novel technique for building a syllable based continuous speech recognizer when unannotated transcribed train data is available. We present two different segmentation algorithms to segment the speech and the corresponding text into comparable syllable like units. A group delay based two level segmentation algorithm is proposed to extract accurate syllable units from the sp...



Journal title:
  • EURASIP J. Adv. Sig. Proc.

Volume 2004  Issue 

Pages  -

Publication date 2004